Unsupervised Morpheme Analysis Evaluation by a Comparison to a Linguistic Gold Standard - Morpho Challenge 2008
نویسندگان
چکیده
The goal of Morpho Challenge 2008 was to find and evaluate unsupervised algorithms that provide morpheme analyses for words in different languages. Especially in morphologically complex languages, such as Finnish, Turkish and Arabic, morpheme analysis is important for lexical modeling of words in speech recognition, information retrieval and machine translation. The evaluation in Morpho Challenge competitions consisted of both a linguistic and an application oriented performance analysis. This paper describes an evaluation where the competition entries were compared to a linguistic morpheme analysis gold standard. Because the morpheme labels in an unsupervised analysis can be arbitrary, the evaluation is based on matching the morpheme-sharing words between the proposed and the gold standard analyses. In addition to Finnish, Turkish, German and English evaluations performed in Morpho Challenge 2007, the competition this year had an additional evaluation in Arabic. The results in 2008 show that although the level of precision and recall varies substantially between the tasks in different languages, the best methods seem to manage all the tested languages quite well. The Morpho Challenge was part of the EU Network of Excellence PASCAL Challenge Program and organized in collaboration with CLEF.
منابع مشابه
Unsupervised Morpheme Analysis Evaluation by a Comparison to a Linguistic Gold Standard - Morpho Challenge 2007
This paper presents the evaluation of Morpho Challenge Competition 1 (linguistic gold standard). The Competition 2 (information retrieval) is described in a companion paper. In Morpho Challenge 2007, the objective was to design statistical machine learning algorithms that discover which morphemes (smallest individually meaningful units of language) words consist of. Ideally, these are basic voc...
متن کاملUnsupervised Morpheme Analysis Evaluation by IR experiments - Morpho Challenge 2008
This paper presents the evaluation and results of Competition 2 (information retrieval experiments) in the Morpho Challenge 2008. Competition 1 (a comparison to linguistic gold standard) is described in a companion paper. In Morpho Challenge 2008 the goal was to search and evaluate unsupervised machine learning algorithms that provide morpheme analysis for words in different languages. The morp...
متن کاملUnsupervised Morpheme Analysis Evaluation by IR experiments - Morpho Challenge 2007
This paper presents the evaluation of Morpho Challenge Competition 2 (information retrieval). The Competition 1 (linguistic gold standard) is described in a companion paper. In Morpho Challenge 2007, the objective was to design statistical machine learning algorithms that discover which morphemes (smallest individually meaningful units of language) words consist of. Ideally, these are basic voc...
متن کاملAllomorfessor: Towards Unsupervised Morpheme Analysis
We extend the unsupervised morpheme segmentation method Morfessor Baseline to account for the linguistic phenomenon of allomorphy, where one morpheme has several different surface forms. Our method discovers common base forms for allomorphs from an unannotated corpus. We evaluate the method by participating in the Morpho Challenge 2008 competition 1, where inferred analyses are compared against...
متن کاملOverview of Morpho Challenge in CLEF 2007
Morpho Challenge 2007 contained an evaluation of unsupervised morpheme analysis algorithms using information retrieval experiments utilizing data available in CLEF. The objective of the challenge was to design statistical machine learning algorithms that discover which morphemes (smallest individually meaningful units of language) words consist of. Ideally, these are basic vocabulary units suit...
متن کامل